January 10, 2018
Intro to geospatial data
Intro to coordinate reference systems (CRS)
Classes & methods for working with spatial data in R
Mapping geospatial data
Practice
Patty Frontiera
Who are you?
Why are you here?
https://github.com/dlab-geo/r-geospatial-workshop
Download the zip file
Open RStudio and start a new script
Follow along by opening r-geospatial-workshop-pt1-sp2018.Rmd in your web browser
install.packages(
c("sp","rgdal","tmap","classInt","RColorBrewer",
"ggplot2","leaflet"), dependencies=TRUE
)
Data about locations on or near the surface of the Earth.
Convey geographic information but don't specify location on the surface of the Earth.
represent location geometrically with coordinates
46.130479, -117.134167

Coordinates indicate specific locations on the Earth when associated with a geographic coordinate reference system or CRS.
A geographic CRS specifies: (1) an ellipsoid model of the shape of the Earth, (2) a datum that ties the ellipsoid to the Earth's surface, and (3) angular units, i.e. degrees of latitude and longitude.
Because of variations in 1-3, there are many geographic CRSs!
The World Geodetic System of 1984 is the default geographic CRS used today.
A Projected CRS applies a map projection to a Geographic CRS
Map Projection: mathematical transformation from a curved to a flat surface

There are many, many, many projected CRSs
All introduce distortion, e.g. in shape, area, distance, direction
No best one for all purposes
Selection depends on location, extent and purpose

Spatial data is a more generic term that is not just for geographic data.
Spatial data are powerful because
Points, lines and Polygons

Regular grid of cells (or pixels)

Most software can only query and analyze numbers and text
Geospatial data require software that can import, create, store, edit, visualize and analyze locations represented geometrically as coordinates referenced to the surface of the earth.
Software that can import, create, store, edit, visualize and analyze locations represented geometrically as coordinates georeferenced to the surface of the earth.
A geospatial database of information collected for a particular purpose, usually to answer specific questions.
Desktop GIS - ArcGIS, QGIS
Web GIS - ArcGIS Online, CARTO
Spatial Databases - PostgreSQL
Custom software - leaflet web maps
You already use R
Reproducibility
Free & Open Source
Cutting edge
Thousands of Cool add-ons
Geodata can be stored in many different types of files.
The simplest is point data stored in a CSV file.
"ID","name","x","y","taste","price","crowded","food"
"1","babette",-122.255374,37.868428,10,4,1,1
"2","musical",-122.260698,37.868383,7,3.25,1,1
"3","starbucks",-122.266057,37.870441,6,2.95,1,0
"4","yalis",-122.266385,37.873528,7,2.95,0,0
"5","berkeleyesp",-122.268681,37.873664,3,3.25,1,0
"6","fertile",-122.268863,37.874934,5,3.25,0,1
"7","guerilla",-122.269206,37.877949,9,3.5,1,1
"8","philz",-122.269221,37.878093,10,4,1,0
"9","nefeli",-122.2603,37.875417,6,3.2,0,1
"10","goldenbear",-122.259613,37.869632,1,2,1,0
There are many approaches to and packages for working with geospatial data in R.
One approach is to keep it simple and store geospatial data in a data frame
sfhomes <- read.csv('data/sf_properties.csv')
head(sfhomes,6)
## FiscalYear SalesDate Address YearBuilt
## 1 2015 <NA> 0000 0019 RIVERTON DR0000 1949
## 2 2015 <NA> 0000 1335 MASONIC AV0000 1900
## 3 2015 <NA> 0000 1256 28TH AV0000 1941
## 4 2015 <NA> 0000 0088 ENTRADA CT0000 1927
## 5 2015 <NA> 0000 0090 MONTCALM ST0000 1938
## 6 2015 <NA> 0000 0025 GOLETA AV0000 1936
## NumBedrooms NumBathrooms NumRooms NumStories NumUnits AreaSquareFeet
## 1 0 0 6 1 1 1455
## 2 0 0 8 2 1 2190
## 3 0 0 7 1 1 2040
## 4 0 0 8 2 1 2413
## 5 0 0 6 1 1 1770
## 6 0 0 6 1 1 2260
## ImprovementValue LandValue Neighborhood
## 1 66386 33617 Sunset/Parkside
## 2 61811 38224 Haight Ashbury
## 3 64357 35679 Sunset/Parkside
## 4 62217 37827 West of Twin Peaks
## 5 74076 25975 Bernal Heights
## 6 64182 35870 Sunset/Parkside
## Location SupeDistrict totvalue SalesYear
## 1 (37.7336426217969, -122.487523696156) 7 100003 NA
## 2 (37.7685762370234, -122.445265935163) 5 100035 NA
## 3 (37.7639276775532, -122.486496732295) 4 100036 NA
## 4 (37.7240546670353, -122.468263348608) 7 100044 NA
## 5 (37.7467909269629, -122.406602489544) 9 100051 NA
## 6 (37.7350415121305, -122.483500474243) 4 100052 NA
## lat lon
## 1 37.73364 -122.4875
## 2 37.76858 -122.4453
## 3 37.76393 -122.4865
## 4 37.72405 -122.4683
## 5 37.74679 -122.4066
## 6 37.73504 -122.4835
plot(sfhomes$lon, sfhomes$lat)
ggplot2
library(ggplot2)
ggplot() + geom_point(data=sfhomes, aes(lon,lat), col="red", size=1)
d2015 <- subset(sfhomes, as.numeric(SalesYear) == 2015)
nrow(d2015)
hist(sfhomes$totvalue)
ggplot() + geom_point(data=d2015, aes(x=lon, y=lat, col=totvalue))
ggmap extends ggplot2
library(ggmap)
sf_map <- get_map("San Francisco, CA")
ggmap(sf_map) +
geom_point(data=d2015, aes(x=lon, y=lat, col=totvalue))
get_map
sf_ctr <- c(lon = mean(sfhomes$lon), lat = mean(sfhomes$lat))
sf_ctr # take a look
sf_basemap <- get_map(sf_ctr, zoom=12, scale=1)
ggmap(sf_basemap) + geom_point(data=d2015, aes(x=lon, y=lat, col=totvalue))
# ?get_map to see options
sf_basemap_lite <- get_map(sf_ctr, zoom=12, scale=1,
maptype = "toner-lite", source="stamen")
ggmap(sf_basemap_lite) +
geom_point(data=d2015, aes(x=lon, y=lat, col=totvalue))
# Let's look at sales since 2006
d2010_16 <- subset(sfhomes, as.numeric(SalesYear) > 2005)
ggmap(sf_basemap_lite) +
  geom_point(aes(lon, lat, col=totvalue), data = d2010_16) +
  facet_wrap(~ SalesYear)
Redo above facet map with the following changes:
There are limits to what you can do with geospatial data stored in a dataframe
and mapping the data with ggplot and ggmap
The most common format for non-point geospatial data.
Can't directly read or plot a shapefile with data frames/ggplot/ggmap
Data cleaning! Before Mapping
Compute spatial metrics and relationships
Data linkages - when data are in same CRS
sp Package
Classes and Methods for Spatial Data
The sp package is most commonly used to construct and manipulate spatial data objects in R.
Hundreds of other R packages that do things with spatial data typically build on sp objects.
sf package
The sf, or simple features, package in R has many improvements
Based on open standards for specifying spatial data
But most spatial packages still depend on sp
So, live on the bleeding edge or check back in a year or so.
sp package
library(sp)
getClass("Spatial") # See all sp object types
## Class "Spatial" [package "sp"]
##
## Slots:
##
## Name: bbox proj4string
## Class: matrix CRS
##
## Known Subclasses:
## Class "SpatialPoints", directly
## Class "SpatialMultiPoints", directly
## Class "SpatialGrid", directly
## Class "SpatialLines", directly
## Class "SpatialPolygons", directly
## Class "SpatialPointsDataFrame", by class "SpatialPoints", distance 2
## Class "SpatialPixels", by class "SpatialPoints", distance 2
## Class "SpatialMultiPointsDataFrame", by class "SpatialMultiPoints", distance 2
## Class "SpatialGridDataFrame", by class "SpatialGrid", distance 2
## Class "SpatialLinesDataFrame", by class "SpatialLines", distance 2
## Class "SpatialPixelsDataFrame", by class "SpatialPoints", distance 3
## Class "SpatialPolygonsDataFrame", by class "SpatialPolygons", distance 2
| Geometry | Spatial Object | Spatial Object with Attributes |
|---|---|---|
| Points | SpatialPoints | SpatialPointsDataFrame |
| Lines | SpatialLines | SpatialLinesDataFrame |
| Polygons | SpatialPolygons | SpatialPolygonsDataFrame |
We use the S*DF objects most frequently!
Airbnb Rental data from http://insideairbnb.com
rentals <- read.csv('data/sf_airbnb_2bds.csv')
class(rentals)
## [1] "data.frame"
dim(rentals)
## [1] 1206 14
str(rentals)
## 'data.frame': 1206 obs. of 14 variables:
## $ id : int 91957 18139835 172943 2309140 2559297 149108 10832295 15134083 283235 8013423 ...
## $ name : Factor w/ 1203 levels " Victorian Suite in Inner Mission",..: 1091 267 917 1128 479 904 349 696 1064 976 ...
## $ latitude : num 37.8 37.8 37.8 37.8 37.8 ...
## $ longitude : num -122 -122 -122 -122 -122 ...
## $ property_type : Factor w/ 4 levels "Apartment","Condominium",..: 1 1 3 1 1 3 2 1 1 1 ...
## $ room_type : Factor w/ 1 level "Entire home/apt": 1 1 1 1 1 1 1 1 1 1 ...
## $ accommodates : int 6 4 3 4 4 3 6 4 4 4 ...
## $ bathrooms : num 1.5 1 1 1 1 1 1 1 1 2 ...
## $ bedrooms : int 2 2 2 2 2 2 2 2 2 2 ...
## $ beds : int 4 2 2 2 2 2 2 2 2 2 ...
## $ price : int 300 275 279 395 235 300 450 349 199 235 ...
## $ review_scores_rating: int 94 100 100 97 86 100 100 92 96 93 ...
## $ neighbourhood : Factor w/ 52 levels "Alamo Square",..: 7 52 7 52 52 7 52 26 52 20 ...
## $ listing_url : Factor w/ 1206 levels "https://www.airbnb.com/rooms/1003756",..: 1156 490 446 666 689 293 55 311 711 1073 ...
head(rentals)
## id name latitude longitude
## 1 91957 Sunny SF Flat with 5*s 37.76194 -122.4493
## 2 18139835 Brand New Apartment in the Heart of NOPA! 37.77568 -122.4453
## 3 172943 Redwood Cottage on Ashbury Heights 37.76022 -122.4495
## 4 2309140 The Lady Di 37.77524 -122.4394
## 5 2559297 Entire 2BR w/pkg centrally-located! 37.77507 -122.4409
## 6 149108 QUIET LUXURY COLE VALLEY PARKING 37.76101 -122.4507
## property_type room_type accommodates bathrooms bedrooms beds price
## 1 Apartment Entire home/apt 6 1.5 2 4 300
## 2 Apartment Entire home/apt 4 1.0 2 2 275
## 3 House Entire home/apt 3 1.0 2 2 279
## 4 Apartment Entire home/apt 4 1.0 2 2 395
## 5 Apartment Entire home/apt 4 1.0 2 2 235
## 6 House Entire home/apt 3 1.0 2 2 300
## review_scores_rating neighbourhood
## 1 94 Cole Valley
## 2 100 Western Addition/NOPA
## 3 100 Cole Valley
## 4 97 Western Addition/NOPA
## 5 86 Western Addition/NOPA
## 6 100 Cole Valley
## listing_url
## 1 https://www.airbnb.com/rooms/91957
## 2 https://www.airbnb.com/rooms/18139835
## 3 https://www.airbnb.com/rooms/172943
## 4 https://www.airbnb.com/rooms/2309140
## 5 https://www.airbnb.com/rooms/2559297
## 6 https://www.airbnb.com/rooms/149108
hist(rentals$price)
cheap <- subset(rentals, price < 401)
hist(cheap$price)
cheap_good <- subset(cheap, review_scores_rating > 98)
hist(cheap_good$price)
Use the sp::coordinates() method
Requires a vector indicating the x,y columns
sp will create a SpatialPointsDataFrame from the data frame
#First make a copy
cheap_good_orig <- cheap_good
coordinates(cheap_good) <- c('longitude','latitude')
class(cheap_good)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
str(cheap_good_orig)
## 'data.frame': 374 obs. of 14 variables:
## $ id : int 18139835 172943 149108 10720392 14594885 15546103 13516315 10058568 1721354 3585034 ...
## $ name : Factor w/ 1203 levels " Victorian Suite in Inner Mission",..: 267 917 904 504 448 323 948 314 519 389 ...
## $ latitude : num 37.8 37.8 37.8 37.8 37.8 ...
## $ longitude : num -122 -122 -122 -122 -122 ...
## $ property_type : Factor w/ 4 levels "Apartment","Condominium",..: 1 3 3 1 2 3 1 1 3 1 ...
## $ room_type : Factor w/ 1 level "Entire home/apt": 1 1 1 1 1 1 1 1 1 1 ...
## $ accommodates : int 4 3 3 6 4 4 4 4 4 4 ...
## $ bathrooms : num 1 1 1 1.5 2 1 2 2 1 1.5 ...
## $ bedrooms : int 2 2 2 2 2 2 2 2 2 2 ...
## $ beds : int 2 2 2 2 2 2 3 3 2 2 ...
## $ price : int 275 279 300 250 200 225 350 215 389 200 ...
## $ review_scores_rating: int 100 100 100 100 99 100 100 100 100 100 ...
## $ neighbourhood : Factor w/ 52 levels "Alamo Square",..: 52 7 7 20 52 20 7 7 20 7 ...
## $ listing_url : Factor w/ 1206 levels "https://www.airbnb.com/rooms/1003756",..: 490 446 293 41 268 336 200 3 439 776 ...
str(cheap_good)

You can see from str(cheap_good) that a SPDF object is a collection of slots or components. The key ones are:
@data - data frame of attributes that describe each location
@coords - the coordinates for each location
@bbox - the min and max bounding coordinates
@proj4string - the coordinate reference system definition as a string
Review the output of each of these:
summary(cheap_good)
head(cheap_good@coords)
head(cheap_good@data)
cheap_good@bbox
cheap_good@proj4string
Are all the columns that were present in the DF now in the SPDF?
Is there a slot without data?
cheap_good@proj4string # get a CRS object
## CRS arguments: NA
Defining the CRS of a spatial data object enables the software to locate the coordinates in a specific space.
Geospatial-aware software will have definitions for supported Earth referenced coordinate systems
For example…
You need to know
Known as defining a projection in ArcGIS
Defining a CRS != Transforming CRS
We will get to transformations soon!
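The distinction can be sketched in a few lines of sp code (a sketch, assuming a hypothetical SpatialPoints object `pts` holding lon/lat coordinates with no CRS assigned yet):

```r
library(sp)

# DEFINING a CRS just labels the existing coordinates -- the numbers
# stored in the object do not change
proj4string(pts) <- CRS("+init=epsg:4326")

# TRANSFORMING recomputes the coordinates into a new CRS -- the
# numbers change (here, from degrees to UTM Zone 10N meters)
pts_utm <- spTransform(pts, CRS("+init=epsg:26910"))
```

Defining tells the software what the coordinates already are; transforming changes what they are.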
Need the proj4 string or code that defines the CRS
proj4 is a widely used library of CRS definitions and functions.
sp includes the CRS() function for creating a CRS object
# Create a CRS object from its proj4 string
WGS84_CRS <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
# Set the CRS of the SPDF
proj4string(cheap_good) <- WGS84_CRS
# check it
cheap_good@proj4string
## CRS arguments:
## +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
# or use the EPSG code directly
proj4string(cheap_good) <- CRS("+init=epsg:4326")
# or enter the string
proj4string(cheap_good) <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
What happens if we assign the wrong CRS?
or not at all?
Software doesn't care but you need to!
# 4326 is the code for WGS84
proj4string(cheap_good) <- CRS("+init=epsg:4326")
See http://spatialreference.org/
Use this site to find EPSG codes and proj4 CRS strings
Because I can't remember a CRS string but I can remember ~10 codes
Geographic
4326 Geographic, WGS84 (default for lon/lat)
4269 Geographic, NAD83 (USA Fed agencies like Census)
Projected
5070 USA Contiguous Albers Equal Area Conic
3310 CA Albers Equal Area
26910 UTM Zone 10, NAD83 (Northern Cal)
3857 Web Mercator (web maps)
Use http://spatialreference.org/ to make an educated guess as to the CRS of these coordinates:
X = 549228.058249, Y = 4176578.444299
Strategy:
What are the units for that CRS?
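One way to test a guess (a sketch, using the sp and rgdal setup from earlier): define a point with the candidate CRS, transform it to lon/lat, and see if it lands somewhere sensible.

```r
library(sp)
library(rgdal)

# Guess: UTM Zone 10N, NAD83 (EPSG:26910), units = meters
pt <- SpatialPoints(cbind(549228.058249, 4176578.444299),
                    proj4string = CRS("+init=epsg:26910"))

# If the guess is right, the transformed point should fall in the
# Bay Area (lon around -122, lat around 37-38)
spTransform(pt, CRS("+init=epsg:4326"))
```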
You can use the standard R plot command to make a basic map of one or more sp objects.
plot(cheap_good)
plot options
You can enhance the basic plot with custom graphical parameters.
See the Quick-R website's page on graphical parameters.
plot(cheap_good, col="red", bg="lightblue", pch=21, cex=2)
sp includes a plotting method spplot
Use it to create quick maps
Or to create great maps
But complex, non-intuitive syntax = long ugly code
See examples in sp Gallery: Plotting maps with sp
spplot of the rental data by price
spplot(cheap_good,"price")
Use spplot to create data maps from some of the other columns in the @data slot
Getting help:
?spplot
spplot(cheap_good,"bathrooms")
spplot(cheap_good,"accommodates")
spplot(cheap_good,"property_type")
spplot(cheap_good,"neighbourhood")
Vector points, lines & polygons:
Raster grids
See GIS file formats
This is one of the most common, if not the most common, spatial vector data file formats.
Old but everywhere!
Gotchas: 2GB limit, 10 char column names, poor unicode support
There's an R package for that!
rgdal
rgdal provides R bindings to the powerful and widely used GDAL library.
It is the most commonly used R library for importing and exporting spatial data.
OGR: for vector data: readOGR() and writeOGR()
GDAL for raster data: readGDAL() and writeGDAL()
rgdal
library(rgdal)
## Warning: package 'rgdal' was built under R version 3.4.3
## rgdal: version: 1.2-16, (SVN revision 701)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 2.1.3, released 2017/01/20
## Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rgdal/gdal
## GDAL binary built with GEOS: FALSE
## Loaded PROJ.4 runtime: Rel. 4.9.3, 15 August 2016, [PJ_VERSION: 493]
## Path to PROJ.4 shared files: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rgdal/proj
## Linking to sp version: 1.2-5
# See what file types are supported by rgdal drivers
# ogrDrivers()$name
gdal.org
?readOGR
For more info on working with rgdal to load different types of spatial data in R see this excellent tutorial by Zev Ross.
sfboundary <- readOGR(dsn="data",layer="sf_boundary")
## OGR data source with driver: ESRI Shapefile
## Source: "data", layer: "sf_boundary"
## with 1 features
## It has 3 fields
# or
# sfboundary <- readOGR("data","sf_boundary")
# but not
#sfboundary <- readOGR(dsn="data/",layer="sf_boundary")
str(sfboundary)
summary(sfboundary)
How?
plot(sfboundary)
How?
head(sfboundary@data)
## NAME Pop2010 Land_sqmi
## 0 San Francisco 805235 46.87
How?
Is it defined? Geographic or projected CRS?
sfboundary@proj4string
## CRS arguments:
## +proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80
## +towgs84=0,0,0
plot(sfboundary)
points(cheap_good, col="red")
Where are the points?
plot(sfboundary)
points(cheap_good, col="red")
Compare the CRSs, are they the same?
proj4string(sfboundary)
proj4string(cheap_good)
proj4string(sfboundary) == proj4string(cheap_good)
proj4string(sfboundary)
## [1] "+proj=utm +zone=10 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
proj4string(cheap_good)
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
proj4string(sfboundary) == proj4string(cheap_good)
## [1] FALSE
sfboundary@bbox
## min max
## x 542696.6 556659.9
## y 4173563.7 4185088.6
cheap_good@bbox
## min max
## longitude -122.50647 -122.38947
## latitude 37.70738 37.80646
The #1 reason…

All geospatial data should have the same CRS.
When they don't, they need to be transformed to a common CRS.
This is also called a projection transformation, projecting, or reprojecting.
Use the sp function spTransform()
Requires as input:
Outputs a new spatial object with coordinate data in the target CRS
sfboundary
sf_lonlat <- spTransform(sfboundary, WGS84_CRS)
How will we know?
proj4string(cheap_good) == proj4string(sf_lonlat)
plot(sf_lonlat)
points(cheap_good, col="red")
points(cheap_good[cheap_good$price<100,], col="green", pch=19)
Set the CRS to that of another data layer
sf_lonlat <- spTransform(sfboundary, CRS(proj4string(cheap_good)))
Use CRS string
sf_lonlat <- spTransform(sfboundary, CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"))
Use a CRS code
sf_lonlat <- spTransform(sfboundary, CRS("+init=epsg:4326"))
Use a CRS object
WGS84_CRS <- CRS(proj4string(cheap_good))
sf_lonlat <- spTransform(sfboundary, WGS84_CRS)
Use writeOGR to save sf_lonlat to a new shapefile
See ?writeOGR for help
# write transformed data to a new shapefile
writeOGR(sf_lonlat, dsn = "data", layer = "sf_boundary_geo",
driver="ESRI Shapefile")
# is it there?
dir("data")
We want all data in the same CRS
Which one is best?
Let's read in some line data
In the popular GeoJSON file format

sf_streets <- readOGR(dsn='data/sf_highways.geojson', layer="OGRGeoJSON")
## OGR data source with driver: GeoJSON
## Source: "data/sf_highways.geojson", layer: "OGRGeoJSON"
## with 246 features
## It has 5 fields
WARNING: Reading & writing GeoJSON can be tricky!
plot(sf_streets)
str(sf_streets)
summary(sf_streets)
Plot all the data on one map
How do we do that?
Recall our earlier examples.
plot(sf_lonlat)
lines(sf_streets)
points(cheap_good, col="red")
points(cheap_good[cheap_good$price<200,], col="green")
Layers draw in order added to plot.
The GIS Layer Cake
Should be: Rasters > Polygons > Lines > Points
Order can also matter for Features!
plot(cheap_good, col="red")
lines(sf_streets)
plot(sf_lonlat, add=TRUE) # note syntax
coordinates() - data frame to SPDF
rgdal::readOGR and writeOGR
Created Spatial*DataFrames
Defined CRS with proj4string()
Transformed CRS with spTransform()
Mapped data with plot() and spplot
Mapped multiple geospatial data layers
Terminology
Also called thematic maps or data maps
Use data values to determine symbology
Use symbology to convey data values / meaning
Important for all phases of the research process, from exploratory analysis through reporting results.
Lots of them
Let's quickly discuss the most common ones
plot
Pros
Cons
spplot
Pros
Cons
ggplot2 and ggmap
Pros
Cons
No direct support for sp objects
tmap
Pros
Works with sp objects & CRSs
ggplot2-like syntax
Static or interactive maps
Cons
leaflet
Pros
R interface to the popular JavaScript library
Allows one to create interactive maps that can be served on web
Highly customizable!
Cons
tmap stands for thematic map
Great maps with less code than the alternatives
Syntax should be familiar to ggplot2 users, but simpler
Good starting points
Load the library
library(tmap)
## Warning: package 'tmap' was built under R version 3.4.3
Let's explore population data for US states.
Data are in the file data/us_states_pop.shp
Challenge
Use readOGR to load it and plot to view it
uspop <- readOGR("./data", "us_states_pop")
## OGR data source with driver: ESRI Shapefile
## Source: "./data", layer: "us_states_pop"
## with 49 features
## It has 4 fields
plot(uspop)
Review the Data with commands we used earlier
What type of data object is uspop?
How many features does it contain?
How many attributes describe those features?
What is the CRS?
qtm
Make a quick plot with default options
qtm(uspop)
qtm of a data value
qtm(uspop,"POPULATION")
tmap Shapes and Graphic Elements
tmap's flexibility comes from how intuitively it allows you to layer spatial data and style the layers by data attributes
Use tm_shape(<sp_object>) to specify a geospatial data layer
Add + tm_<element>(...) to style the layer by data values
…and other options for creating a publication ready map
tmap functionality
?tm_polygons

tm_shape(uspop) + tm_polygons(col="beige", border.col = "red")
tm_shape(uspop) + tm_polygons(col="#f6e6a2", border.col = "white")
Create a tmap of Population with different fill and outline colors
You can use color names or HEX values
See ?tm_polygons and try customizing another parameter
tm_shape(uspop) + tm_polygons(col="#f6e6a2", border.col = "black", alpha=0.5, lwd=0.5)
Notice anything odd about shape of USA?
tm_shape(uspop) + tm_polygons(col="#f6e6a2", border.col = "white")
Setting the CRS Dynamically
tm_shape(uspop, projection="+init=epsg:5070") + tm_polygons(col="#f6e6a2", border.col = "white")
Also called On-the-fly reprojection in ArcGIS & QGIS
Very cool!
BUT, if you want to use data in a different CRS it is best to transform it.
Transform the uspop data to the USA Contiguous Albers CRS (5070),
Save output as a new SpatialPolygonsDataFrame called uspop_5070
uspop_5070 <- spTransform(uspop, CRS("+init=epsg:5070"))
tm_shape(uspop_5070) + tm_polygons(col="#f6e6a2", border.col = "white")
tm_shape(uspop_5070) + tm_polygons(col="#f6e6a2", border.col="#f6e6a2") + tm_shape(uspop) + tm_borders(col="purple")
Color areas by data values
Fun name for a type of data map
Sometimes called heatmap
tm_shape(uspop_5070) + tm_polygons(col="POPULATION")
Color Palettes
Data Classification
The R package RColorBrewer is widely used to select color palettes for maps.
Read the package help for more info.
?RColorBrewer or see colorbrewer.org
library(RColorBrewer)
Use display.brewer functions to see available color palettes.
display.brewer.all()
Complementary colors, e.g. pastels, to emphasize different categories, not magnitudes or order.
display.brewer.all(type="qual")
Light to dark shades of the same or related colors to imply order, e.g. higher to lower ranks or values
display.brewer.all(type="seq")
Two contrasting sequential color palettes at either end to emphasize outliers with a lighter palette in the middle to reveal trends.
display.brewer.all(type="div")
tm_shape(uspop_5070) + tm_polygons(col="POPULATION", palette="BuPu")
Try a sequential, divergent and qualitative color palette with this dataset.
Set auto.palette.mapping=FALSE and TRUE with the SPECTRAL palette.
?tm_polygons
tm_shape(uspop_5070) + tm_polygons(col="POPULATION", palette="Spectral",
auto.palette.mapping=FALSE)
Unclassified maps scale full range of values to color palette
Great for exploring trends and outliers,
but hard to interpret specific data values.
The eye can only differentiate a few colors.
Important for improving the cartographic display of continuous data
Reduces complexity by grouping continuous values into a small number of bins, typically 5-7.
Unique symbology - color - is then associated with each bin.
A legend indicates the association, making it easier to interpret data values.
Common methods for binning data into a set of classes include:
equal interval: classes the data into bins of equal data ranges, e.g. 10-20, 20-30, 30-40.
quantile: classes the data into bins with an equal number of data observations. This is the default for most mapping software.
fisher/jenks/natural breaks: classes data into bins that minimize within-group variance and maximize between-group variance.
standard deviation: emphasizes outliers by classing data into bins based on standard deviations around the mean, e.g. -2 to +2 SDs.
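These methods can be compared directly with the classInt package (installed at the start of the workshop); `vals` here is a made-up vector with one large outlier:

```r
library(classInt)

vals <- c(2, 4, 5, 7, 9, 12, 15, 40, 95, 300)

classIntervals(vals, n = 4, style = "equal")    # equal-width bins
classIntervals(vals, n = 4, style = "quantile") # equal counts per bin
classIntervals(vals, n = 4, style = "jenks")    # natural breaks
classIntervals(vals, n = 4, style = "sd")       # standard deviations
```

Note how the outlier at 300 dominates the equal-interval bins but not the quantile bins.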
tmap
tmap uses the classification methods in the classInt package
The class method is set by the style parameter in tm_polygons - or another style element, e.g. tm_symbols
See ?tm_polygons for keyword options
tmap data classification - Quantiles
tm_shape(uspop_5070) + tm_polygons(col="POPULATION", style="quantile")
tmap data classification - Jenks
tm_shape(uspop_5070) + tm_polygons(col="POPULATION", style="jenks")
Try a different classification scheme.
Note how it impacts the display of the data and thus communicates different messages
What is the default classification scheme used by tmap?
In addition to setting the classification method you can set the number of classes, or cuts.
By default this is 5
Explore setting it to 3 or 9 with n=
tm_shape(uspop_5070) + tm_polygons(col="POPULATION", palette="BuPu",
style="jenks", n=9)
The Art and Science of Map Making
Science:
Art:
Map popdens instead of POPULATION
Don't use defaults
tm_shape(uspop_5070) + tm_polygons(col=c("POPULATION","popdens"),
style=c("jenks","jenks"))
Sometimes polygons get in the way of what you want to communicate. Why?
What state has the greatest population density?
tm_shape(uspop_5070) + tm_symbols(col="popdens", style="jenks")
Use tm_symbols to map the data as points
tm_shape(uspop_5070) + tm_symbols(col="popdens", style="jenks")
tm_shape(uspop_5070) + tm_polygons(col="white", border.alpha = 0.5) + tm_shape(uspop_5070) + tm_symbols(col="popdens", style="jenks")
Graduated Color Map when data are classified
Proportional Color Map when they are not
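In tmap terms the difference is whether a classification style is applied (a sketch using the uspop_5070 layer from earlier; style="cont" is tmap's continuous, unclassified option):

```r
library(tmap)

# Graduated color map: values binned into classes
tm_shape(uspop_5070) + tm_polygons(col="POPULATION", style="jenks")

# Proportional (continuous) color map: full data range mapped to the palette
tm_shape(uspop_5070) + tm_polygons(col="POPULATION", style="cont")
```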
Change the marker shape to square
We have been using color to communicate data values.
We can also use size.
Proportional and Graduated Symbol maps vary symbol size by data value
What changed to make a symbol not color map?
tm_shape(uspop_5070) + tm_polygons(col="white", border.alpha = 0.5) + tm_shape(uspop_5070) + tm_symbols(size="popdens", style="jenks")
Take a look at ?tm_symbols and adjust the graduated symbol map
Change the outline and fill colors of the point symbols and add fill transparency.
Also customize the legend title
tm_shape(uspop_5070) + tm_polygons(col="white", border.alpha = 0.5) +
tm_shape(uspop_5070) + tm_symbols(size="popdens", style="sd",
size.lim=c(1,500), col="purple", alpha=0.5, border.col="grey",
title.size="Population Density (km2)")

tmap
tm_shape(uspop_5070) + tm_polygons("POPULATION") +
tm_scale_bar(position=c("left","bottom")) +
tm_compass(position=c("right","center")) +
tm_style_classic(title="Patty's Map",
title.position=c("center","top"), inner.margins=c(.05,.05, .15, .25))
tmap
tmap has two modes: plot and view
The plot mode is the default and is for creating static maps.
The view mode is for creating interactive maps using the Leaflet JavaScript library.
You can switch back and forth with the tmap_mode() function.
Same syntax for static and interactive tmaps!
tmap_mode("view")
tm_shape(uspop_5070) + tm_symbols(size="popdens", style="sd",
col="purple", border.col="black", alpha=0.5)
?tmap_mode
tmap_mode("view")
tm_shape(uspop_5070) + tm_symbols(size="popdens", style="sd",
col="purple", border.col="black", alpha=0.5,
popup.vars=c("NAME","POPULATION","popdens"))
Return to plot mode
tmap_mode('plot')
## tmap mode set to plotting
tmap Choropleth Maps
A legend is automatically added to interactive tmap choropleth maps
mymap <- tm_shape(uspop_5070) +
tm_polygons("popdens", style="jenks", title="Population Density")
tmap_mode("view")
mymap
Make a graduated color map of SF Airbnb rentals (cheap_good)
Make an interactive map
tm_shape(sf_lonlat) +
tm_polygons(col="beige", border.col = "blue") +
tm_shape(sf_streets) +
tm_lines(col="black", lwd = 3) +
tm_shape(sf_streets) +
tm_lines(col="white", lwd = 1) +
tm_shape(cheap_good) +
tm_symbols(col="red", size=0.5, alpha=0.5, style="jenks")
tmap_mode("view")
tm_shape(sf_lonlat) +
tm_polygons(col="beige", border.col = "blue") +
tm_shape(sf_streets) +
tm_lines(col="black", lwd = 3) +
tm_shape(sf_streets) +
tm_lines(col="white", lwd = 1) +
tm_shape(cheap_good) +
tm_symbols(col="red", size = 0.5, alpha=0.5, style="jenks",
popup.vars=c("name","price","listing_url"))
tmap_mode("plot")
Mapping categorical (or qualitative) data with symbol shapes
tm_shape(sfboundary) + tm_polygons(col="beige") + tm_shape(cheap_good) + tm_symbols(shape="property_type", size=0.5)
bigmap <- tm_shape(sfboundary) + tm_polygons(col="beige") +
tm_shape(cheap_good) +
tm_symbols(size="accommodates", title.size="Accommodates", col="price",
title.col="Price", shape="property_type", title.shape="Property Type") +
tm_layout( legend.bg.color="white",inner.margins=c(.05,.05, .15, .25),
title="Airbnb 2 Bedroom Rentals, San Francisco Fall 2017",
legend.position=c("right","center"))
bigmap
save_tmap
You can save static and interactive maps with save_tmap
See: ?save_tmap for details
map1 <- tm_shape(uspop_5070) + tm_polygons(col="POPULATION", style="jenks")
save_tmap(map1, "tmap_choropleth.png", height=4) # static image file
save_tmap(map1, "tmap_choropleth.html") # interactive web map
tmap
See vignette("tmap-modes") for more on interactive maps.
tmap github repo (readme file, demo and example folders)
leaflet Package
There are several R packages that output leaflet maps.
Use the leaflet package for more customized leaflet maps
Highly recommended if you want to make interactive maps.
See the RStudio Leaflet tutorial.
library(leaflet)
leaflet(cheap_good) %>% addTiles() %>%
addCircleMarkers(data = cheap_good, radius = 5, stroke=F,
color = "purple", fillOpacity = 0.75
)
pal <- colorQuantile("Reds",NULL,5)
leaflet(cheap_good) %>% addTiles() %>%
addCircleMarkers(
data = cheap_good,
radius = 6,
color = ~pal(price),
stroke = F,
fillOpacity = 0.75
)
popup_content <- cheap_good$name
popup_content <- paste0(popup_content, "<br>Price per night: $", cheap_good$price)
popup_content <- paste0(popup_content, "<br><a href=",cheap_good$listing_url,">More info...</a>")
leaflet(cheap_good) %>% addTiles() %>%
addCircleMarkers(
data = cheap_good,
radius = 6,
color = ~pal(price),
stroke = F,
fillOpacity = 0.75,
popup = popup_content)
Raster data
Geoprocessing
Spatial Analysis
Next week, Part II
Think about it
Try it and see
Check the help: ?spDistsN1
coit_tower <- c(-122.405837, 37.802032)
cheap_good$coit_dist <-
spDistsN1(cheap_good, coit_tower, longlat = TRUE)
head(cheap_good@data)
sf
sessionInfo()
## R version 3.4.1 (2017-06-30)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] RColorBrewer_1.1-2 tmap_1.11 rgdal_1.2-16
## [4] sp_1.2-5
##
## loaded via a namespace (and not attached):
## [1] viridisLite_0.2.0 jsonlite_1.5 splines_3.4.1
## [4] geojsonlint_0.2.0 foreach_1.4.4 R.utils_2.6.0
## [7] gtools_3.5.0 shiny_1.0.5 expm_0.999-2
## [10] stats4_3.4.1 yaml_2.1.16 LearnBayes_2.15
## [13] backports_1.1.2 lattice_0.20-35 digest_0.6.13
## [16] colorspace_1.3-2 htmltools_0.3.6 httpuv_1.3.5
## [19] Matrix_1.2-12 R.oo_1.21.0 plyr_1.8.4
## [22] XML_3.98-1.9 rmapshaper_0.3.0 raster_2.6-7
## [25] gmodels_2.16.2 xtable_1.8-2 webshot_0.5.0
## [28] scales_0.5.0 gdata_2.18.0 satellite_1.0.1
## [31] gdalUtils_2.0.1.7 mapview_2.2.0 magrittr_1.5
## [34] mime_0.5 deldir_0.1-14 evaluate_0.10.1
## [37] R.methodsS3_1.7.1 nlme_3.1-131 MASS_7.3-48
## [40] class_7.3-14 tools_3.4.1 geosphere_1.5-7
## [43] stringr_1.2.0 V8_1.5 munsell_0.4.3
## [46] compiler_3.4.1 e1071_1.6-8 units_0.4-6
## [49] classInt_0.1-24 grid_3.4.1 tmaptools_1.2-2
## [52] RCurl_1.95-4.10 dichromat_2.0-0 iterators_1.0.9
## [55] htmlwidgets_0.9 crosstalk_1.0.0 bitops_1.0-6
## [58] base64enc_0.1-3 rmarkdown_1.8 boot_1.3-20
## [61] codetools_0.2-15 DBI_0.7 jsonvalidate_1.0.0
## [64] curl_3.1 R6_2.2.2 knitr_1.17
## [67] udunits2_0.13 rgeos_0.3-26 rprojroot_1.3-1
## [70] spdep_0.7-4 KernSmooth_2.23-15 stringi_1.1.6
## [73] osmar_1.1-7 Rcpp_0.12.14 sf_0.5-5
## [76] png_0.1-7 leaflet_1.1.0 spData_0.2.6.9
## [79] coda_0.19-1
library(knitr)
purl("r-geospatial-workshop-f2017.Rmd", output = "scripts/r-geospatial-workshop-f2017.r", documentation = 1)